Abstract. The feasibility of using artificial neural
networks as control systems for modern, complex
aerospace vehicles is investigated via an example air- 
craft control design study. The problem considered 
is that of designing a controller for an integrated air- 
frame/propulsion longitudinal dynamics model of a 
modern fighter aircraft to provide independent con- 
trol of pitch rate and airspeed responses to pilot com- 
mand inputs. An explicit model-following controller 
using H∞ control design techniques is first designed
to gain insight into the control problem as well as 
to provide a baseline for evaluation of the neurocon- 
troller. Using the model of the desired dynamics as a 
command generator, a multilayer feedforward neural 
network is trained to control the vehicle model within 
the physical limitations of the actuator dynamics. 
This is achieved by minimizing an objective function 
which is a weighted sum of tracking errors and control
input commands and rates. To gain insight into the
neurocontrol, linearized representations of the neurocontroller
are analyzed along a commanded trajectory.
Linear robustness analysis tools are then applied
to the linearized neurocontroller models and to
the baseline H∞ based controller. Future areas of
research are identified to enhance the practical appli- 
cability of neural networks to flight control design. 

1 Introduction. In the past few years, there 
has been an increasing interest in the control com- 
munity to exploit the promise of artificial neural net- 
works to solve difficult control problems. However, 
most of the neural network applications to control de- 
sign that have appeared in the literature [1,2], either 
dealt with robotic systems, or with control problems 
that are mainly of academic interest such as the in- 
verted pendulum problem. Only more recently have 
neural networks been applied to the control design 
of more complex problems, e.g. manufacturing pro- 
cess [3]. The objective of this paper is to investigate 
the applicability of neural networks as controllers for 
aerospace vehicles with special emphasis on piloted
flight. Towards this objective, results are presented
from a preliminary study of neurocontrol design for
an integrated airframe/propulsion model of a modern
fighter aircraft for the piloted longitudinal landing
task. To gain insight into the characteristics of
the neurocontroller, linear analysis tools are applied
to linearized representations of the neurocontroller
and to a baseline H∞ based controller. Closed-loop
system performance and robustness of the neurocontroller
are evaluated and discussed in relation to the
H∞ based controller.

The paper is organized as follows. The vehicle
model and the desired closed-loop dynamics are first
discussed, and an explicit model-following H∞ based
control design is presented. The architecture used to
train the neurocontroller is then presented and the
results of the neurocontroller are evaluated. A performance
and robustness analysis is then presented
for the neurocontroller and the H∞ based controller.

2 Vehicle Model. The vehicle model con- 
sists of an integrated airframe and propulsion system 
state-space representation for a modern fighter air- 
craft powered by a two-spool turbofan engine and 
equipped with a two-dimensional thrust-vectoring 
and reversing nozzle. 

The flight condition used in this application is representative
of the STOL (Short Take-off and Landing)
approach-to-landing task, with an airspeed of
$V_0 = 120$ knots, a flight path angle of $\gamma_0 = -3$ deg,
and a pitch attitude of $\theta_0 = 7$ deg. The linearized
dynamics of the vehicle model are of the form

$$\dot{x} = A x + B u_c, \qquad z = C x, \tag{1}$$

where the state vector is 

$$x = [u, w, Q, \theta, h, N2, N25, P6, T41B]^T, \tag{2}$$

with

u = aircraft body axis forward velocity (ft/sec)
w = aircraft body axis vertical velocity (ft/sec)
Q = aircraft pitch rate (rads/sec)
θ = pitch angle (rads)
h = altitude (ft)
N2 = engine fan speed (rpm)
N25 = core compressor speed (rpm)
P6 = engine mixing plane pressure (psia)
T41B = engine high pressure turbine blade temperature (°R),
and the control input vector is 

$$u_c = [WF, \delta TV]^T, \tag{3}$$

with

WF = engine main burner fuel flow rate (lbm/hr)
δTV = nozzle thrust vectoring angle (deg).

The vehicle outputs to be controlled are 

$$z = [V, Q]^T, \tag{4}$$

where V is the aircraft velocity in ft/sec, and Q is
the pitch rate in deg/s. The system matrices A, B,
and C are available in Ref. [4]. The open-loop vehicle
eigenvalues are:

$$\lambda_1 = 0.07, \quad \lambda_{2,3} = -0.09 \pm j0.23, \quad \lambda_4 = 1.06, \quad \lambda_5 = -1.47 \qquad \text{(airframe modes)}$$

and

$$\lambda_6 = -1.40, \quad \lambda_7 = -3.57, \quad \lambda_8 = -6.96, \quad \lambda_9 = -89.28 \qquad \text{(propulsion modes)}.$$

Note that the airframe is statically unstable with a
highly unstable pitch mode. Open loop analysis also
indicated a strong coupling in the response of the
controlled outputs $z$ to control inputs $u_c$.

The control design objective is to design a control
system that provides decoupled command tracking of
velocity and pitch rate from pilot control inputs with
aircraft responses compatible with Level I handling
qualities requirements [5]. The desired response dynamics
are selected to be of the form

$$\dot{x}_m = A_m x_m + B_m z_{SEL}, \qquad z_c = C_m x_m, \tag{5}$$

with $z_{SEL} = [V_{SEL}, Q_{SEL}]^T$, where $V_{SEL}$ is the pilot
velocity command in ft/s and $Q_{SEL}$ is the pilot longitudinal
stick deflection in inches, and $z_c = [V_c, Q_c]^T$,
where the subscript "c" refers to the ideal response
in V and Q with units of ft/s and deg/s respectively.
The system matrices $A_m$, $B_m$ and $C_m$ are the state-space
representation of the ideal response transfer
functions listed in Table 1.

Table 1: Desired Response Transfer Functions

Notation: $k(1/\tau)/[\zeta; \omega_n] = k\,(s + 1/\tau)/(s^2 + 2\zeta\omega_n s + \omega_n^2)$

$$\frac{V_c}{V_{SEL}} = \frac{0.04\,(3.13)}{[0.89;\ 0.36]}, \qquad \frac{Q_c}{Q_{SEL}} = \frac{35.12\,(0.5)}{[0.89;\ 2.24]}$$

Actuator models were also used in control design
and evaluation. The fuel flow actuator was modelled
as

$$G_{WF}(s) = \frac{10}{s + 10} \cdot \frac{50}{s + 50}, \tag{6}$$

with a maximum fuel flow rate $|WF|_{max} = 10{,}000$ lbm/hr,
and a rate limit $|\dot{WF}|_{max} = 20{,}000$ lbm/hr/s.
Note that the fuel flow here corresponds
to the perturbation from the trim value for
the linear model. In this study, the value $|WF|_{max}$
is therefore chosen such that the total fuel flow limit
will not be exceeded when a perturbation of a magnitude
of $|WF|_{max}$ is commanded. The thrust vectoring
actuator is modelled as

$$G_{\delta TV}(s) = \frac{15}{s + 15}, \tag{7}$$

with a maximum thrust vector angle $|\delta TV|_{max} = 10$ deg,
and a rate limit $|\dot{\delta TV}|_{max} = 20$ deg/s.

As a result, nonlinearities appear in the control design
and evaluation in the form of actuator position
and rate limits.
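To make the role of the position and rate limits concrete, the following is a minimal sketch of the thrust vectoring actuator, a first-order lag 15/(s + 15) with a 10 deg position limit and a 20 deg/s rate limit. The forward-Euler integration, the 0.02 s step, the order in which the limits are applied, and the function name `simulate_tv_actuator` are all assumptions made for illustration; the source does not state how the limits were implemented.

```python
def simulate_tv_actuator(command_deg, dt=0.02, bandwidth=15.0,
                         pos_limit=10.0, rate_limit=20.0):
    """Forward-Euler simulation of 15/(s + 15) with rate and position limits.

    All numerical choices except the bandwidth and the two limits are
    illustrative assumptions, not taken from the paper.
    """
    position = 0.0
    history = []
    for cmd in command_deg:
        # Unconstrained first-order lag: rate proportional to the error
        rate = bandwidth * (cmd - position)
        # Rate limit (deg/s)
        rate = max(-rate_limit, min(rate_limit, rate))
        position += dt * rate
        # Position limit (deg)
        position = max(-pos_limit, min(pos_limit, position))
        history.append(position)
    return history


# A 2 s step command of 15 deg, deliberately beyond the position limit
response = simulate_tv_actuator([15.0] * 100)
```

With such an oversized command the output ramps at the rate limit and then sits on the position limit, which is the kind of nonlinearity the design has to tolerate.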

3 H∞ Control Design. Recent advances in
H∞ control theory [6] and computational algorithms
to solve for H∞ optimal control laws [7] have enabled
the application of this theory to practical complex
multivariable control design problems. Many example
applications of H∞ based control designs for
aerospace vehicles have appeared in recent literature
[8-10]. Prior to applying a neural network approach
to control design for the example vehicle under study,
an H∞ based control law was obtained as a baseline
for the performance and robustness analysis of the
neurocontroller.

Within the framework of H∞ optimization, the
control design problem for this example study was
formulated as the model-following problem shown in
Fig. 1. The three transfer functions that are of interest
for such a problem are the sensitivity function
$S(s)$, the complementary sensitivity function
$T(s)$, and the control transmission function $C(s)$.
These represent the transfer functions from the reference
commands to tracking errors, controlled variables,
and commanded control inputs respectively,
i.e. $e(s) = S(s) z_c(s)$, $z(s) = T(s) z_c(s)$ and $u_c(s) = C(s) z_c(s)$.
In order to be able to influence both the
low-frequency and high-frequency properties of the
closed-loop system, it is desirable to find a controller
$K(s)$ which minimizes a weighted norm of a combination
of these three transfer functions, i.e.:


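$$\min_{K} \|R(j\omega)\|_\infty \quad \text{with} \quad R(j\omega) = \begin{bmatrix} W_S(j\omega)\,S(j\omega) \\ W_T(j\omega)\,T(j\omega) \\ W_C(j\omega)\,C(j\omega) \end{bmatrix} \tag{8}$$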


The weighting functions $W_S(j\omega)$, $W_T(j\omega)$ and
$W_C(j\omega)$ are the "knobs" used by the control designer
to "tune" the controller $K(s)$ such that the design
objectives are met. For instance, choosing $W_S$ to
be large at low frequencies ensures good command
tracking performance, and choosing $W_T$ to be large at
high frequencies ensures robustness to high frequency
unmodelled dynamics. $W_C$ is chosen to ensure that
control actuation bandwidths, as well as rate and deflection
limits, are not exceeded in the control design.

For the aircraft example, the integrated design
model, $P(s)$, in Fig. 1 consisted of the vehicle model
(1) and the actuator models (6) and (7). The ideal response
model, $R(s)$, in Fig. 1 consisted of the desired
model dynamics (5) with a first-order high-pass filter on
the pilot pitch rate command. This high-pass filter
is added to reflect the fact that pitch rate cannot be
commanded in steady-state. The outputs $z$ and the
errors $e$ were scaled by their approximate maximum
values to be commanded by the pilot, with $V_c^0 = 20$
ft/sec and $Q_c^0 = 3$ deg/sec. The sensitivity weights
$W_S$ and the complementary sensitivity weights $W_T$
were chosen as listed in Table 2.


Table 2: Weights for H∞ Control Design

Controlled Variable    W_S                              W_T
V                      (33.50 s + 1000)/(335.01 s + 1)  0.22 s/(0.0022 s + 1)
Q                      (6.70 s + 1000)/(67.02 s + 1)    0.044 s/(0.00044 s + 1)

This choice of $W_S$ and $W_T$ was based on the performance
and robustness arguments discussed earlier.
The weights $W_C$ consisted of the control commands
and rates weighted by the inverse of actuator position
and rate limits for WF and δTV listed earlier. Note
that the combination of tracking errors $e$ and aircraft
outputs $z$ is used as a controller input instead of $e$
and ideal response, $z_c$, to avoid control saturation
due to large pilot inputs and undue amplification of
inadvertent pilot command noise.
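The frequency shaping described for the weights can be checked numerically. The sketch below evaluates the magnitudes of the velocity-channel weights as read from Table 2, $W_S = (33.50 s + 1000)/(335.01 s + 1)$ and $W_T = 0.22 s/(0.0022 s + 1)$, at a low and a high frequency. The helper name `weight_magnitude` and the sample frequencies are illustrative assumptions.

```python
def weight_magnitude(num, den, omega):
    """|W(j*omega)| for W(s) = (num[0]*s + num[1]) / (den[0]*s + den[1])."""
    s = 1j * omega
    return abs((num[0] * s + num[1]) / (den[0] * s + den[1]))


# Velocity-channel weights as read from Table 2
W_S_V = ((33.50, 1000.0), (335.01, 1.0))
W_T_V = ((0.22, 0.0), (0.0022, 1.0))

low, high = 1e-4, 1e4  # rad/s, arbitrary sample frequencies
ws_low, ws_high = weight_magnitude(*W_S_V, low), weight_magnitude(*W_S_V, high)
wt_low, wt_high = weight_magnitude(*W_T_V, low), weight_magnitude(*W_T_V, high)

print(f"|W_S|: {ws_low:.1f} at {low} rad/s, {ws_high:.3f} at {high} rad/s")
print(f"|W_T|: {wt_low:.2e} at {low} rad/s, {wt_high:.1f} at {high} rad/s")
```

The sensitivity weight is large at low frequency and small at high frequency, while the complementary sensitivity weight behaves the opposite way, consistent with the tracking and robustness arguments in the text.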

The H∞ control design plant as discussed above
is of 21st order, consisting of the 9th order aircraft
model, 2nd order WF actuator model, 1st order δTV
actuator model, 5th order ideal response model, and
1st order $W_S$ and $W_T$ for the two controlled variables.
The resulting 21st order H∞ optimal controller
obtained using the solution algorithm of Ref. [6] was
reduced to 13th order by residualizing the high order
modes. The maximum eigenvalue of the reduced
order controller is $|\lambda|_{max} = 6.83$ rad/sec, which implies
that the controller can be implemented digitally
with reasonable sampling rates. With this reduced-order
controller, the performance results in terms of
closed-loop response, control effort and control rate
requirements are shown in Figs. 2 and 3 for two cases
of pilot command inputs: (1) $V_{SEL} = -20$ ft/s for
t > 0, $Q_{SEL} = 0.5$ in for 0 < t < 3 sec and $Q_{SEL} = 0$ in
for t > 3 sec; (2) $V_{SEL} = 20$ ft/s for t > 0 and $Q_{SEL}$
same as for command input case 1. From Fig. 2, we
note that for the pilot command input in case 1 the 
velocity response obtained with the controller is quite 
close to the ideal response, and the control input com- 
mands and rates are reasonable. For the pilot com- 
mand input in case 2, the pitch rate response is quite 
similar to that for case 1; however, the velocity re- 
sponse is degraded from the ideal response. Case 2 
is demanding in that the pilot is commanding the 
aircraft to pitch up as well as accelerate to a higher 
velocity. As seen in Fig. 3, the maximum fuel flow 
rate is commanded by the controller for an extended 
period of time in order to track the ideal response. 
Note that the closed-loop system remains stable in 
the presence of the actuator limits, and the aircraft 
response tracks the ideal response in the steady-state. 
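As a quick check of the figures quoted in this section, the design plant order and the frequency equivalent of the fastest reduced-order controller mode can be worked out as follows. The symbols $n_{\text{plant}}$ and $f_{\max}$ are introduced here only for illustration.

```latex
\begin{align*}
n_{\text{plant}} &= \underbrace{9}_{\text{aircraft}}
  + \underbrace{2}_{WF\ \text{actuator}}
  + \underbrace{1}_{\delta TV\ \text{actuator}}
  + \underbrace{5}_{\text{ideal response}}
  + \underbrace{2 \times 1}_{W_S}
  + \underbrace{2 \times 1}_{W_T} = 21, \\
f_{\max} &= \frac{|\lambda|_{\max}}{2\pi}
  = \frac{6.83\ \text{rad/s}}{2\pi} \approx 1.09\ \text{Hz}.
\end{align*}
```

A fastest mode near 1 Hz is slow compared with typical digital flight control update rates, which is in line with the remark about reasonable sampling rates.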

4 Neurocontrol Design. Although the 
strength of neural networks lies in their ability to 
handle nonlinearities in the controlled dynamics, the 
control design for a linear aircraft model is being con- 
sidered in this paper to gain insight into the neu- 
ral network characteristics by using linear analysis 
tools. As discussed earlier, nonlinearities of concern 
for practical control design, such as actuator position 
and rate limits, are included in the design criteria. 

The architecture for training the neurocontroller is
shown in detail in Fig. 4. For each pilot selected trajectory
$z_{SEL}(t)$, a commanded trajectory $z_c(t)$ is generated
from (5). Prior to training, the commanded
variables $z_c(t)$ are discretized and scaled to $z'_c(t_k)$
using the same scaling as for the H∞ design. Likewise,
the dynamics of the actuators and of the vehicle
model are discretized and scaled after normalizing
the control input vector by its maximum value
$(|WF|_{max}, |\delta TV|_{max})$. As for the H∞ design, the
tracking error at time $t_k$ is the error between the
scaled vehicle output vector and its desired scaled
value at the same time $t_k$, i.e. $e_z(t_k) = z'_c(t_k) - z'(t_k)$.

However, because of the time-discretization of the actuator
dynamics and vehicle model dynamics within
the training loop, a commanded control input vector
generated at time $t_k$ by the neurocontroller will only
affect the aircraft output at time $t_{k+2}$. Consequently,
the tracking error at time $t_{k+2}$ defines the magnitudes
of the weight increments at time $t_k$. Said in another
way, due to the time-discretization of the dynamics,
the internal representation of the neurocontroller has
to be updated at time $t_k$ on the basis of information
which will be only available at a later time $t_{k+2}$. To
be consistent with the time-discretized design, knowledge
of the anticipated commanded vehicle output at
time $t_{k+2}$, $z'_c(t_{k+2})$, is explicitly provided to the neural
network at time $t_k$ during training by means of
the commanded error $\bar{e}_z(t_k) = z'_c(t_{k+2}) - z'(t_k)$. This
procedure ensures that the proper action will be commanded
by the neurocontroller at time $t_k$ to achieve
the desired tracking at time $t_{k+2}$ during training.
When operating the trained neural network in closed-loop,
however, the tracking error $e_z(t_k)$ will be used
as input to the neurocontroller instead of the commanded
error $\bar{e}_z(t_k)$, which is not available in the real
simulation because it requires knowledge of future pilot
command inputs. This means that the trained
neural network will be tracking the exact commanded
trajectory with a two-step time delay during simulation
evaluation. Since the neurocontroller operates in
the continuous time domain, this two-step time delay
should not adversely affect performance in closed-loop
evaluation. That such is the case was confirmed
by the closed-loop evaluation results to be presented
later.
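The two-step delay between a command and its first effect on the output follows from the clocked chain controller, actuator, vehicle. The toy sketch below replaces the actuator and vehicle dynamics with pure one-step latches, which is an assumption made only to isolate the timing; the function name `run_latched_chain` is hypothetical.

```python
def run_latched_chain(commands):
    """Pass commands through two latched one-step stages.

    Stage 1 stands in for the discretized actuators, stage 2 for the
    discretized vehicle model. Both are pure delays here (an assumption
    made to show the timing only, not the dynamics).
    """
    actuator_out = 0.0  # latched actuator output
    vehicle_out = 0.0   # latched vehicle output
    outputs = []
    for u_c in commands:
        outputs.append(vehicle_out)  # output observed at step k
        # Both latches update together at the clock tick
        vehicle_out, actuator_out = actuator_out, u_c
    return outputs


# A unit command issued at step k = 0 only
outputs = run_latched_chain([1.0, 0.0, 0.0, 0.0])
print(outputs)  # [0.0, 0.0, 1.0, 0.0]
```

The command issued at step 0 first appears in the output at step 2, which is why the error at the later step is the one that must drive the weight update for the earlier step.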

As shown in Fig. 4, the two commanded control inputs
are calculated by a two hidden-layer feedforward
neural network with eight input units (or four pairs of
fan-out units associated to the Q and V variables),
and two neurons in the output layer. These pairs
consist of the scaled output vector $z'(t_k)$; the commanded
error $\bar{e}_z(t_k)$ between the scaled vehicle output
vector at time $t_k$ and its desired scaled value at
time $t_{k+2}$; the discrete time-derivative of the tracking
error, $\dot{e}_z(t_k)$; and the time-average of the tracking
error, $(1/t_k)\int_0^{t_k} e_z(t)\,dt$. As in the H∞ design, the
motivation behind using the combination of $z'(t_k)$
and $\bar{e}_z(t_k)$ as inputs to the neurocontroller, instead of
$z'(t_k)$ and $z'_c(t_{k+2})$, is to allow the neural network to
reconstruct the command without direct feedforward
of the command. The role of the error rates $\dot{e}_z(t_k)$ is
to provide the neural network with lead information,
and the time-averaged error feedback $(1/t_k)\int_0^{t_k} e_z(t)\,dt$
is to minimize the steady-state tracking error for step
command inputs. (The motivation behind scaling the
integral error $\int_0^{t_k} e_z(t)\,dt$ into its time-average was to
improve backpropagation learning by bounding the
corresponding input to the neural network. Other
alternatives would be to low-pass filter the integral
error itself, or to remove the scaling factor $1/t_k$ from
the time-averaged error as learning takes place. Because
of their potential to improve steady-state tracking,
these latter approaches should be considered in
future neurocontrol designs.) In Fig. 4, the symbol Δ
represents a latch that is clocked every $\delta t$ seconds to
update the inputs to the neurocontroller, the actuators
and the vehicle model. A network configuration
of 15 neurons in the first hidden layer, and 10 neurons
in the second hidden layer, is chosen for the neurocontroller.
Each neuron of the neurocontroller has
the activation function


$$y = \tanh(x), \tag{9}$$


which limits its output y to the interval [-1, +1] for 
any input signal x. For a given set of weights of 
the neural network, the two output neurons yield the 
normalized commanded control input vector 


$$u'_c(t_k) = \left[ \frac{WF_c}{|WF|_{max}},\ \frac{\delta TV_c}{|\delta TV|_{max}} \right]^T \tag{10}$$


which is applied to the scaled actuators. After a small
time-interval $\delta t = t_{k+1} - t_k$, the actuators yield the
normalized actuator control output vector $u'_a(t_{k+1})$
as defined by (6) and (7). The normalized actuator
control output vector $u'_a(t_{k+1})$ is subsequently applied
as input to the scaled vehicle model over the
time-interval $[t_{k+1}, t_{k+2}]$, and changes the state vector
of the vehicle model from $x(t_{k+1})$ to $x(t_{k+2})$. In
order to maximize the tracking performance while
minimizing the costs associated with high control effort
and high control rate requirements, the neural
network is trained to minimize an objective function
that includes tracking errors, control effort and control
rate requirements


$$J(t_k) = e_z^T(t_{k+2})\,\rho\,e_z(t_{k+2}) + {u'_a}^T(t_{k+1})\,\lambda\,u'_a(t_{k+1}) + {\dot{u}'_a}^T(t_{k+1})\,\mu\,\dot{u}'_a(t_{k+1}) \tag{11}$$

where $e_z(t_{k+2})$ is the error between the scaled commanded
vector $z'_c(t_{k+2})$ and the scaled vehicle output
$z'(t_{k+2})$. The matrices $\rho$, $\lambda$ and $\mu$ are 2x2 diagonal
matrices whose coefficients can be adapted so as
to modify the characteristics of the neurocontroller
in order to achieve a practical performance/control-effort
trade-off. Expression (11) is of the same form
as the objective function used in Ref. [11] to design
a neurocontroller for the same airframe/propulsion
system, but without simulating the actuator dynamics
within the training loop. In Ref. [11], it was found
that training the neural network to minimize only the
tracking error led to high control effort and high control
rate requirements. When the actuator dynamics
were included in the closed-loop evaluation, this resulted
in a highly oscillatory pitch rate response and
a limit cycle behavior in velocity/fuel-flow response.
However, a satisfactory trade-off between tracking
performance and control effort could be achieved with
finite values of $\lambda$ and $\mu$. Since the bandwidth limiting
effect of the actuators is now explicitly taken into account
within the training loop, much improvement in
performance/control-effort trade-off is expected from
the minimization of (11).
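As an illustration of the structure just described, here is a minimal sketch of an 8-15-10-2 feedforward network with tanh neurons and of a quadratic cost of the form (11). The random weight initialization, its scale, the treatment of thresholds as an extra weight column, the helper names, and the use of the same scalar for both diagonal entries of λ and μ are all assumptions made for illustration; only the layer sizes, the activation, and the form of the cost come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
LAYER_SIZES = [8, 15, 10, 2]  # inputs, hidden 1, hidden 2, outputs


def init_weights(sizes, scale=0.1):
    """Random weights; the last column of each matrix holds the thresholds."""
    return [scale * rng.standard_normal((n_out, n_in + 1))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def neurocontroller(weights, inputs):
    """Forward pass; every neuron uses tanh, so outputs lie in [-1, 1]."""
    activation = np.asarray(inputs, dtype=float)
    for w in weights:
        # Append the permanently "on" unit that feeds the thresholds
        activation = np.tanh(w @ np.append(activation, 1.0))
    return activation


def objective(e_z, u_a, u_a_rate, rho, lam, mu):
    """Quadratic cost of the form (11); diagonal weights passed as vectors."""
    e_z, u_a, u_a_rate = (np.asarray(v, dtype=float) for v in (e_z, u_a, u_a_rate))
    rho, lam, mu = (np.asarray(v, dtype=float) for v in (rho, lam, mu))
    return float(e_z @ (rho * e_z) + u_a @ (lam * u_a) + u_a_rate @ (mu * u_a_rate))


weights = init_weights(LAYER_SIZES)
u_c = neurocontroller(weights, np.zeros(8))  # normalized [WF_c, dTV_c]
cost = objective([0.1, -0.2], [0.5, 0.5], [1.0, 0.0],
                 rho=[2000.0, 20.0], lam=[0.01, 0.01], mu=[0.1, 0.1])
```

Because the output neurons saturate at ±1, the normalized commands can never exceed the actuator position limits, whatever the weights are.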

The backpropagation algorithm [12] was used to 
find the set of weights of the neurocontroller which 
minimize the objective function (11) over the set of 
pilot input commands. In order to backpropagate 
(11), a single layer feedforward neural network (per- 
ceptron) was used in place of the vehicle model in the 
training architecture of Fig. 4. This neural network 
emulator had 11 input units (corresponding to the 
two normalized actuator control outputs and to the 
nine state variables of the vehicle model), and 9 lin- 
ear output neurons (corresponding to the nine state 
variables of the vehicle model). Likewise, two feedfor- 
ward neural networks were used to emulate the dis- 
cretized dynamics of the actuators. The second-order 
dynamics of the fuel flow actuator were simulated by 
a three-layer network of linear and linear-thresholding 
neurons. As shown in Fig. 5, constraining fuel flow ef- 
fort and fuel flow rate requirements is achieved by 
thresholding the linear neurons of the two last lay- 
ers. The first-order dynamics of the thrust vectoring 
actuator were simulated by the two-layer neural net- 
work shown in Fig. 6. Constraining the effort and 
rate requirements of the thrust vectoring actuator is 
achieved by means of linear-thresholding neurons. 

The layers of an (N+1)-layer neural network can
be labeled by an index p from 0 to N, p = 0 denoting
the input layer. Layer p has $\nu(p)$ elements
consisting of $[\nu(p) - 1]$ neurons and one unit that is
permanently "on" and used to define the thresholds
of the neurons of the (p+1)th layer. With symmetric
activation functions of the type (9), the threshold
of a neuron is defined as the value of its input signal
above which its output is positive, and below which
its output is negative. During training, the thresholds
are updated with backpropagation in a manner
similar to the updating of the weights [12].

The weight connecting the i-th neuron of the p-th
layer to the j-th neuron of the (p+1)th layer is
denoted as $w_{j,(p+1);i,p}$. The threshold of the j-th
neuron of the (p+1)th layer thus corresponds to
$w_{j,(p+1);\nu(p),p}$. For a single feedforward pass through the
neural network, a weight increment is given by

$$\Delta w_{j,(p+1);i,p} = \alpha\, O_{i,p}\, \Delta_{j,(p+1)} \tag{12}$$

where $\alpha$ is the steepest descent coefficient, $O_{i,p}$ is
the output of the i-th neuron of the p-th layer, and
$\Delta_{j,(p+1)}$ is the effective error at the output of the
j-th neuron of the (p+1)th layer. The effective errors
$\Delta_{k,(p+2)}$ in the (p+2)th layer are backpropagated to
the (p+1)th hidden layer to give the effective errors
in the (p+1)th layer, as

$$\Delta_{j,(p+1)} = f'(x_{j,(p+1)}) \times \delta_{j,(p+1)}$$

with

$$\delta_{j,(p+1)} = \sum_{k=1}^{\nu(p+2)-1} \left[ \Delta_{k,(p+2)}\, w_{k,(p+2);j,(p+1)} \right] \tag{13}$$

and where $f'(x_{j,(p+1)})$ is the value of the derivative of
the neural activation function for an input $x_{j,(p+1)}$ of
the j-th neuron of the (p+1)th layer. In the output
layer, the effective errors $\Delta_{j,N}$ are obtained from the
gradients of the objective function (11)

$$\Delta_{j,N} = -\frac{\partial J}{\partial O_{j,N}} \tag{14}$$
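The recursion of Eqs. (12) and (13) can be sketched for one layer as follows. This assumes tanh neurons, weight matrices whose last column holds the thresholds, and the usual steepest-descent sign convention in which the effective error is the negative gradient of the cost; the function and variable names are hypothetical.

```python
import numpy as np


def backpropagate_errors(effective_err_next, w_next, x_this):
    """Eq. (13): effective errors of layer p+1 from those of layer p+2.

    w_next has shape (neurons in p+2, neurons in p+1 plus one); its last
    column multiplies the permanently "on" unit and is skipped here.
    x_this holds the input signals of the tanh neurons of layer p+1.
    """
    summed_err = w_next[:, :-1].T @ effective_err_next  # the sum in (13)
    tanh_prime = 1.0 - np.tanh(x_this) ** 2             # f'(x) for tanh
    return tanh_prime * summed_err


def weight_increment(alpha, outputs_prev, effective_err):
    """Eq. (12): increments of the weights (and thresholds) feeding a layer."""
    return alpha * np.outer(effective_err, np.append(outputs_prev, 1.0))


# Tiny worked example with hand-picked numbers
err_next = np.array([1.0, -1.0])
w_next = np.array([[1.0, 2.0, 0.5],
                   [3.0, 4.0, 0.5]])
err_this = backpropagate_errors(err_next, w_next, np.zeros(2))
dw = weight_increment(0.1, np.array([1.0, 2.0]), err_this)
```

In this example the neuron inputs are zero, so the tanh derivative is one and the effective errors equal the weighted sum of the errors from the layer above.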


Whenever the neural activation is not differentiable
over the range of all possible neuron input values (as
is the case for the linear-thresholding neurons used for
emulating the actuators), $f'$ should be constructed to
preserve the characteristics of a monotonic continuous
function. For example, the linear-thresholding
activation function which is defined as

$$f_{lth}(x) = \begin{cases} x & \text{if } |x| \le 1, \\ 1 & \text{if } x > 1, \\ -1 & \text{if } x < -1, \end{cases} \tag{15}$$

is clearly not differentiable over $(-\infty, +\infty)$. Since
$f_{lth}$ is piecewise differentiable, it would seem a priori
natural to define $f'_{lth}$ as $f'_{lth}(x) = 1$ if $|x| < 1$, and
$f'_{lth}(x) = 0$ if $|x| > 1$. With this definition of $f'_{lth}$,
however, any time a neuron input $x_0$ would take a
value outside of [-1, +1] during training, the neuron
output would remain trapped at 1, if $x_0 > 1$, or -1, if
$x_0 < -1$. For such neuron input values, the weights
of the incoming connections would remain frozen, and
this would bias the learning. In order to permit the
neurons full access to the output state space during
training, $f'_{lth}$ is thus defined as

$$f'_{lth}(x_{i,p}) = \begin{cases} 1 & \text{if } |x_{i,p}| < 1 \text{ or if } x_{i,p}\,\delta_{i,p} < 0, \\ 0 & \text{otherwise,} \end{cases} \tag{16}$$

which will ensure that the weights are properly incremented
during training. The quantity $\delta_{i,p}$ which appears in
(16) is defined in (13). The serial arrangement of the
neurocontroller, the neuro-emulator of the actuators,
and the neuro-emulator of the vehicle model constitutes
a larger neural network through which the objective
function (11), $J(t_k)$, can be backpropagated
through time [2] using Eqs. (13)-(16). The connections
between neurocontrollers and neuro-emulators
which were used as backpropagating channels are indicated
in Fig. 7 over a period of three time-steps $\delta t$,
and the weight increments are calculated using (12).
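The linear-thresholding activation and its modified derivative can be written out directly. The sketch below follows Eqs. (15) and (16) for a single scalar neuron input; the argument name `summed_err` stands for the backpropagated sum defined in (13), and both function names are illustrative.

```python
def f_lth(x):
    """Linear-thresholding activation of Eq. (15): clip x to [-1, 1]."""
    return max(-1.0, min(1.0, x))


def f_lth_prime(x, summed_err):
    """Modified derivative of Eq. (16).

    Inside the linear range the slope is 1. Outside it, the slope is kept
    at 1 only when the backpropagated sum has the opposite sign to x, so
    that a saturated neuron can still be pulled back into the linear range.
    """
    return 1.0 if abs(x) < 1.0 or x * summed_err < 0.0 else 0.0


# A neuron saturated at +1 (x = 2):
print(f_lth(2.0))               # 1.0
print(f_lth_prime(2.0, 1.0))    # 0.0, error pushes further into saturation
print(f_lth_prime(2.0, -1.0))   # 1.0, error pulls back towards the linear range
```

With the naive piecewise derivative both of the last two calls would return zero, and the incoming weights of a saturated neuron would stay frozen.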

The commanded trajectories used to train the
neural network were generated as follows. The pilot
selected pitch rate was a doublet centered at a
time $t_c$ between 2.5s and 5s, with the characteristics:
$Q_{SEL}(t) = Q_0$ for $0 < t < t_c$; $Q_{SEL}(t) = -Q_0$
for $t_c < t < 2t_c$; $Q_{SEL}(t) = 0$ for $t > 2t_c$. Note
that $Q_{SEL}$ corresponds to pilot longitudinal stick deflection
with units in inches. The pilot selected airframe
velocity was a step function characterized by
$V_{SEL}(t) = 0$ for $t < 0$ and $V_{SEL}(t) = V_0$ for $t > 0$.
The maximum intensities $|Q_0|$ and $|V_0|$ of the randomly
selected input commands were bounded by
$Q_{max} = 0.5$ in and $V_{max} = 20$ ft/s. This maximum
value of $Q_{SEL}$ corresponds to a maximum pitch
rate command of about 3 deg/sec. Random sets
of input trajectories were generated from uniform
distributions of $Q_0$, $t_c$ and $V_0$ over $[-Q_{max}, Q_{max}]$,
[2.5s, 5s] and $[-V_{max}, V_{max}]$ respectively. The commanded
variables $Q_c(t)$ and $V_c(t)$ were filtered from
$Q_{SEL}(t)$ and $V_{SEL}(t)$ over a period of 12s with a
time-step $\delta t = 0.02$s. These types of commanded
trajectories represent typical pilot command inputs.
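The random pilot commands described above can be generated with a few lines of code. The sketch below draws the doublet amplitude, the doublet centre time and the velocity step from uniform distributions over the stated ranges and samples them every 0.02 s over 12 s. It stops short of filtering the commands through the ideal response model (5), since the model matrices are not reproduced here; the function name, the seed, and the handling of the sample at t = 0 are assumptions.

```python
import random


def sample_pilot_commands(rng, dt=0.02, duration=12.0, q_max=0.5, v_max=20.0):
    """Draw one random (Q_SEL, V_SEL) command pair as sampled sequences."""
    q0 = rng.uniform(-q_max, q_max)   # doublet amplitude (in)
    t_c = rng.uniform(2.5, 5.0)       # doublet centre time (s)
    v0 = rng.uniform(-v_max, v_max)   # velocity step (ft/s)
    n_steps = int(round(duration / dt))
    q_sel, v_sel = [], []
    for k in range(n_steps):
        t = k * dt
        if t < t_c:
            q_sel.append(q0)
        elif t < 2.0 * t_c:
            q_sel.append(-q0)
        else:
            q_sel.append(0.0)
        v_sel.append(v0)  # step applied from the first sample (assumption)
    return q_sel, v_sel, (q0, t_c, v0)


rng = random.Random(0)  # fixed seed, only to make the example repeatable
q_sel, v_sel, (q0, t_c, v0) = sample_pilot_commands(rng)
```

Each call yields one training trajectory of 600 samples; repeating the call produces the kind of random training set described in the text.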

Training was performed in two phases. In the
gross-tuning phase of the training, a set of 4000 commanded
trajectories was randomly generated, and the
synaptic weights were updated at every time $t_k = k\,\delta t$
after backpropagating $J(t_k)$ through the neural network.
This was done once for each trajectory of the
training data set with a steepest-descent coefficient
$\alpha = 0.001$. In the fine-tuning phase of the training,
the synaptic weights were updated following a
moving-window scheme: at every time $t_k$, the weights
were incremented after backpropagating through the
neural network the time-integral of the objective
function calculated over $n_w$ sampled points, or during
a period of $n_w\,\delta t$ seconds. As
the width of the moving window was progressively increased
to cover an entire commanded trajectory, i.e.
$n_w = 12\,\text{sec}/0.02\,\text{sec} = 600$, the steepest descent coefficient
$\alpha$ was progressively reduced from the initial
value of 0.001 to 0.0001. In total, the neurocontroller
was trained with approximately 10,000 commanded
trajectories.

5 Neurocontrol Performance. The evaluation
architecture of the neurocontroller in closed-loop
is shown in Fig. 8. The neurocontroller was
tested on step pitch rate input commands, different
from the doublets used in training. The input commands
chosen to illustrate the neurocontrol performance
were defined by the step pitch rate command
$Q_{SEL}(t) = 0.5$ in for $t < 3$ sec, $Q_{SEL}(t) = 0$ for
$t > 3$ sec; applied simultaneously with one of the following
classes of step velocity commands: $V_{SEL}(t > 0) = -20$ ft/sec
(case 1); $V_{SEL}(t > 0) = 20$ ft/sec
(case 2).

When training the neural network without giving
any consideration to the cost associated with large
control efforts and large control rates, i.e. $\lambda = \mu = 0$
in Eq. (11), the neurocontroller learns very satisfactorily
to track the commanded outputs. However,
the fuel flow is quite irregular, and both control input
commands generated by the neurocontroller ride
the actuator rate limits. A study of the trade-off
between tracking performance and control effort requirement
was conducted by training the neural network
with $\lambda$ and $\mu$ of the form $\lambda = \mathrm{diag}[\lambda_{WF}, \lambda_{\delta TV}]$
and $\mu = \mathrm{diag}[\mu_{WF}, \mu_{\delta TV}]$, with the same training
characteristics and the same matrix elements of $\rho$
used earlier. As in Ref. [11], the tracking error is found
to actually decrease for small increases in values of $\lambda$
and $\mu$.

The results from this trade-off study are shown in
Figs. 9-10 for cases 1 and 2 with the choice of parameters
$\rho = \mathrm{diag}[\rho_V, \rho_Q] = \mathrm{diag}[2000, 20]$, $\lambda = 0.01$,
$\mu = 0.1$. The pitch rate response follows the commanded
trajectory very smoothly, in spite of the
thrust vectoring requirement δTV reaching the actuator
rate limit at the initiation and end of the
command. However, within the proposed training
scheme, any attempt to lower the rate of thrust vectoring
by increasing $\mu_{\delta TV}$ resulted in a loss of tracking
performance. In case 1, neurocontrol is very satisfactory
both in pitch rate and velocity response. In
case 2, neurocontrol tracking is still very satisfactory
in pitch rate response, but is slightly less satisfactory
in velocity response owing to the physically demanding
effort of simultaneously increasing aircraft speed
and pitch angle.

In order to estimate the effect of providing the
neurocontroller with lead information during training,
the above process was repeated without feeding
the discrete time-derivatives of the tracking error,
i.e. $\dot{e}_z(t_k)$, to the neural network during training.
Without constraining control efforts and rates
($\lambda = \mu = 0$), the tracking performance deteriorated
significantly with the appearance of some ringing in
the pitch rate response and a limit cycle behavior in
the velocity/fuel-flow response. The fuel flow requirement
and fuel flow rate were both much more oscillatory
than when lead information was provided to the
neurocontroller during training. The fuel flow rate
oscillated between the maximum and minimum rate
limit during and beyond the 12 sec training period. A
more oscillatory behavior was also noted for the control
effort and rate of the thrust vectoring. However,
the situation improved significantly when constraints
on control efforts and rates were applied during training.
In this case, a satisfactory trade-off between
performance and control-effort was reached for values
of $\lambda$ and $\mu$ in the vicinity of $\lambda_{WF} = \lambda_{\delta TV} = 0.02$,
$\mu_{WF} = 0.2$ and $\mu_{\delta TV} = 1.0$. The results showed
a similar velocity/fuel-flow response with and without
lead information, but showed a noticeable degradation
in the pitch-rate/thrust-vectoring response in
comparison to the situation where lead information
was provided to the neurocontroller. This degradation
in tracking performance resulted from the large
value of the pitch rate constraint $\mu_{\delta TV}$ (one order
of magnitude larger than before), which was needed
to decrease the tracking overshoots. In summary,
lead information enabled the neurocontroller to over- 
come ringing and limit cycle behavior while increas- 
ing tracking performance. Thus, within the present 
scheme of neural computation, any dynamic char- 
acteristics required to achieve desirable performance 
had to be incorporated into the neural network with 
an appropriate choice of inputs. An extension of the 
present neural architecture to generate such dynamic 
characteristics could be a feedforward neural network 
with intermediate feedback inputs, i.e. a recurrent 
neural architecture as a dynamic neurocontroller. 

6 Analysis of the Controllers. From
a comparison of the closed-loop response for the
two command cases with the H∞ based reduced order
controller (Figs. 2 and 3) and the neurocontroller
(Figs. 9 and 10), it is evident that the neurocontroller
provides improved command tracking, although at the
expense of increased control rate activity, both for
δTV and WF. Also, the pitch vectoring control requirements
are higher and the fuel flow activity exhibits
oscillatory behavior for the neurocontroller.

Note that the results presented so fair have been 
with the nominal vehicle model used for control de- 
sign. Since this model is only a simplified version 
of the vehicle dynamics, an important criterion for 
design of controllers for flight vehicles is that of ro- 
bustness. Robustness is defined here as maintaining 
performance and stability in the presence of uncertainties
associated with the modelling process. Modelling
uncertainties are due to neglected high order
dynamics, parameter changes due to change in flight
conditions and the margin of error associated with es- 
timating model parameters based on analytical tools 
and experimental data. A classic specification for ro- 
bustness, also used in the military specifications for 
design of flight control systems [5], is that of stability 
margins, specifically gain and phase margin [14]. The
tools to determine these margins are fairly well devel- 
oped for linear systems - classical Bode analysis for 
single-input single-output systems [14] and modern 
singular value and structured singular value analysis 
for multi-input multi-output systems [15, 16]. For 
nonlinear systems, one way to determine robustness 
is to conduct Monte Carlo type simulations using all 
possible combinations of modelling uncertainties that 
can be expected. Another approach is to linearise the 
closed-loop system at various points along a given tra- 
jectory and then apply the linear analysis tools. The 
latter approach is less time consuming and provides 
more insight into the characteristics of the nonlinear 
system. Furthermore, this latter approach allows a
similar analysis to be performed for the linear H∞ based
reduced order controller and the nonlinear neurocon- 
troller, for small perturbations along a given trajec- 
tory. 

Since the vehicle model used in this analysis is 
linear, only linear small perturbation models of the 
neurocontroller at different points along a given tra- 
jectory are needed to perform the type of robust- 
ness analysis discussed earlier. Considering the closed 
loop system response with the neurocontroller for the 
case 2 command inputs, corresponding to the results 
presented in Fig. 10, the linear neurocontroller models 
were generated at times t = 0.5, 2, 4, 6, 8 and 10 secs. 
The first three points in time correspond to tran- 
sient control activity whereas the last three represent 
steady-state type command tracking with monoton- 
ically decreasing tracking error. Note that the neu- 
rocontroller as shown in Fig. 8 consists of 4 sets of 
scaled (normalized) inputs: the time-averaged errors,
the error rates ė'(t), the errors e'(t) and
the controlled outputs z'(t). The scaling, the time-averaged
error and derivative action were embedded
within the neurocontroller during the linearization
process to find a control structure consistent with
the structure of the H∞ based controller, which has
only the errors (e) and the controlled outputs (z) as 
the inputs. The frequency response Bode plots of 
the linearized neurocontroller models were obtained 
to gain insight into the characteristics of the control 
action. Bode gain plots for the thrust vectoring angle 
(δTV) response to all the inputs to the controller linearized
at t = 0.5 sec are shown in Fig. 11. The Bode
gain plots for the H∞ based controller are shown in
Fig. 12. An example variation in the neurocontroller
characteristics with the change in magnitude of the 
inputs to the controller along the trajectory is shown 
in Fig. 13 in terms of the Bode gain plots for pitch rate 
error (eQ) to thrust vectoring angle (δTV) response.
Fig. 13 shows that the neurocontroller gains decrease 
with time. This type of behavior was exhibited by all 
the other input/output Bode plots of the linearized 
neurocontroller models. So in effect, the neurocon- 
troller can be thought of as a set of linear controllers 
with the controller parameters being a strong func- 
tion of the magnitude and direction (relative magni- 
tude) of the inputs to the controller. Note that since 
the H∞ based controller is linear, its dynamics are
independent of the magnitudes of the controller in- 
puts. 
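
The dependence of the linearized gains on the operating point can be
illustrated with a minimal sketch. It assumes a one-hidden-layer feedforward
network with tanh hidden units and hypothetical weights; the paper's actual
network size, activation function and trained weights are not given in this
section. The small perturbation gain matrix is the Jacobian of the network
output with respect to its inputs, evaluated at the chosen operating point.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """One-hidden-layer feedforward network with tanh hidden units (assumed)."""
    return W2 @ np.tanh(W1 @ x + b1) + b2

def mlp_jacobian(x, W1, b1, W2):
    """Small perturbation gain matrix du/dx at the operating point x."""
    h = np.tanh(W1 @ x + b1)
    # d tanh(a)/da = 1 - tanh(a)^2, applied row-wise to the first-layer weights
    return W2 @ ((1.0 - h**2)[:, None] * W1)

# Hypothetical weights: 2 inputs, 3 hidden units, 1 output.
W1 = np.array([[1.0, 0.5], [-0.5, 1.0], [0.3, 0.3]])
b1 = np.zeros(3)
W2 = np.array([[1.0, -1.0, 0.5]])
b2 = np.zeros(1)

# Linearize at increasing input magnitudes along one fixed input direction.
direction = np.array([1.0, 1.0])
for scale in (0.0, 1.0, 10.0):
    J = mlp_jacobian(scale * direction, W1, b1, W2)
    print(scale, np.linalg.norm(J))
```

In this toy network the hidden units saturate as the input magnitude grows, so
the linearized gain changes with the operating point even though the weights
are fixed. A linear controller, by contrast, has the same gain everywhere.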

From Fig. 11 we note that the neurocontroller exhibits
PID (Proportional + Integral + Derivative)
control type behavior from the error inputs (eV and
eQ) to the thrust vectoring angle (δTV) output. This
was also the case for the eV and eQ to WF response,
and was true all along the trajectory as shown par- 
tially (for eQ input) by the plots in Fig. 13. This dy- 
namic behavior of the neurocontroller for the error 
inputs is directly due to allowing feedback of the in- 
tegral and derivative errors. Since no such dynamics 
were added to feedback of V and Q to the neurocon- 
troller, the neurocontroller exhibits only proportional 
type behavior from these inputs. 
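
The PID-type shape of the Bode gain can be sketched as follows. The sketch
assumes that a static gain acting on the error, its integral and its
derivative gives the transfer function K(jω) = kp + ki/(jω) + kd·jω. The
time-averaged error used in the paper is treated here as an ideal integral,
which is a simplification, and the gains `kp`, `ki` and `kd` are hypothetical
rather than values taken from the linearized neurocontroller.

```python
import numpy as np

def pid_gain_db(omega, kp, ki, kd):
    """Bode gain in dB of K(jw) = kp + ki/(jw) + kd*(jw)."""
    s = 1j * np.asarray(omega, dtype=float)
    k = kp + ki / s + kd * s
    return 20.0 * np.log10(np.abs(k))

# One point per decade from 0.01 to 100 rad/sec, with unit gains (hypothetical).
omega = np.logspace(-2, 2, 5)
print(pid_gain_db(omega, kp=1.0, ki=1.0, kd=1.0))
```

With these unit gains the magnitude falls at about 20 dB per decade at low
frequency (integral action), passes through a minimum, and rises at about
20 dB per decade at high frequency (derivative action). A purely proportional
path would instead give a flat gain plot.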

Comparing Figs. 11 and 12, we first note that the
magnitude of the eV and eQ to δTV response is much
lower for the H∞ based controller compared to the
particular linearized neurocontroller models. This
was also true for the error inputs to WF response.
This result is a further confirmation that the control
effort and control rate requirements to track a
given set of commands will be higher for the neurocontroller.
Although the dynamic behavior of the H∞
based controller is more complex than that of the
neurocontroller, some integral and derivative action is evident
in the eQ to δTV response. The integral action was
built into the H∞ based controller through the choice
of the sensitivity weighting; however, unlike for the
neurocontrol design, the error rate information was
not explicitly provided in the H∞ controller. The H∞
control synthesis procedure is such that it naturally
builds into the controller the amount of lead (error rate)
information that is necessary to meet the control
design objectives specified through the weighted
quantities. As evident from Figs. 11 and 12, the H∞
based controller provides lead at a lower frequency in
the eQ to δTV response as compared to the linearized
neurocontroller.

Another difference between the H∞ based controller
and the neurocontroller is the compensation
from the measurements of the controlled plant outputs
(V and Q) to the control inputs (WF and δTV).
As mentioned earlier, this compensation is a “constant”
(varying with input magnitude) gain from
the controller inputs to outputs for the linearized
neurocontroller. However, as seen from Fig. 12, the
H∞ based controller has dynamics associated with
this part of the control compensation and also has
higher compensation gains than the linearized
neurocontroller (Fig. 11). The controller structure used
for the H∞ and the neurocontrol design is consistent
with the classical approach of flight control design,
wherein an inner loop compensation (z → u)
is designed first to provide stability augmentation
and place the augmented plant dynamics within the
handling qualities specifications, and then the outer
loop compensation (e → u) is designed to provide
decoupled command tracking to reduce pilot workload.
The significance of the difference between the
H∞ based controller and neurocontroller “inner loop”
compensation was studied further by considering failures
in the outer compensation loops, i.e. failure in
the error sensors. Eigenvalue analysis showed that
the closed-loop system with the H∞ based controller
remains stable for failures in either or both of the error
sensor loops, whereas the closed-loop system with the
neurocontroller linearized at t = 0.05 sec was unstable
for failure in either or both of the error loops. The
response of the closed-loop system for case 2 commands
and failure in the eQ loop is shown in Fig. 14
for both the H∞ based controller and the nonlinear
neurocontroller. The H∞ based controller still tracks
the velocity command and provides stable response in
pitch rate, whereas the neurocontroller gives a highly
unstable response. So the H∞ based controller is
using the plant measurements (z) in a manner consistent
with the classical idea of providing inner loop
plant augmentation. How to formulate the neurocontrol
design problem such that the resulting controller
exploits the plant measurement information to provide
inner loop stability augmentation is an area of
future research.
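
The kind of eigenvalue check described above can be sketched on a toy problem.
Everything in this sketch is hypothetical: the two-state plant, the static
gains and the function names are illustrative and are not the vehicle model or
either controller from this study. The control law is u = ke·e + kz·z with
e = r − z, and a failed error sensor is modelled as e = 0.

```python
import numpy as np

# Hypothetical open-loop unstable plant (eigenvalues +1 and -2).
A = np.array([[0.0, 1.0], [2.0, -1.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])          # single measured output z = C x

def closed_loop_matrix(ke, kz, error_loop_ok=True):
    """State matrix for u = ke*e + kz*z, e = r - z; e is forced to 0 on failure."""
    k_total = kz - ke if error_loop_ok else kz
    return A + B @ (k_total * C)

def is_stable(M):
    """True if every eigenvalue has a negative real part."""
    return bool(np.max(np.linalg.eigvals(M).real) < 0.0)

# Two hypothetical designs with the same nominal closed loop (kz - ke = -6).
designs = {"strong inner loop": (2.0, -4.0), "weak inner loop": (5.5, -0.5)}
for name, (ke, kz) in designs.items():
    nominal = is_stable(closed_loop_matrix(ke, kz, True))
    failed = is_stable(closed_loop_matrix(ke, kz, False))
    print(name, "nominal stable:", nominal, "stable after error-loop failure:", failed)
```

Both toy designs behave identically with all loops intact, but only the design
whose measurement feedback stabilizes the plant on its own remains stable when
the error loop is lost. This is the distinction the inner loop discussion above
draws.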

Stability margin analysis was performed for the linearized
neurocontroller models and the H∞ based
controller to quantify robustness of the control de- 
signs. Among the linearized neurocontroller models, 
stability margins were worst for the one linearized 
around t = 0.05 sec, so only those results are dis- 
cussed here. Structured singular value analysis [17] 
showed that the H∞ based controller has guaranteed
multivariable gain margins of -3.7 to 6.6 dB (gain factor
of 0.65 to 2.1) and phase margins of ±30 deg for
simultaneous loop gain or phase changes at the plant
output (V and Q) and margins of -3.8 to 7.2 dB and
±32.5 deg at the plant input (WF and δTV). For the
linearized neurocontroller, these multivariable mar- 
gins were only -0.6 to 0.6 dB and ±3.4 deg for loop 
gain variations at the plant output, and -0.9 to 1.1 
dB and ±6.6 deg at the plant input. The low stabil- 
ity margins with the neurocontroller are indicative of 
poor robustness in that the closed loop system might 
be unstable for small uncertainties in the plant dy- 
namics. Since the multivariable margins can some- 
times be conservative, the stability robustness of the 
closed-loop system was further evaluated using the 
more classical approach of “breaking” one loop at a 
time, i.e. one loop open and other loops closed. This 
one-loop-at-a-time analysis confirmed the poor stabil- 
ity margins of the neurocontroller. The closed-loop 
response of the system with the H∞ based controller
and the nonlinear neurocontroller for an added delay
of τd = 0.05 sec in the two control channels (WF
and δTV) is shown in Fig. 15. This value of τd corresponds
to a phase loss of 8 deg at a frequency of 3
rad/sec, which is the frequency that corresponds to
the guaranteed multivariable phase margin of 6.6 deg 
for the linearized neurocontroller, and it is quite rep- 
resentative of the kinds of time delays to be expected 
in practical implementation of complex flight control 
designs. From Fig. 15 we note that the H∞ based
control shows very little degradation in tracking per- 
formance in the presence of time delay, whereas the 
neurocontroller exhibits limit cycle behavior in the 
pitch controlled variable. A factor that may con- 
tribute to this lack of robustness is the fact that the 
neuro-command rides the thrust vectoring rate limit 
during initial and final transients. In contrast, the 
neuro-command is well below the fuel flow rate limit, 
which results in robust velocity tracking in the pres- 
ence of time-delay. Improving phase robustness char- 
acteristics of neurocontrollers and investigating their 
gain robustness characteristics are areas that warrant 
further study. 
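
The margin figures quoted above can be cross-checked with two standard
relations: a gain of g dB corresponds to a factor of 10^(g/20), and a pure
time delay τ contributes a phase lag of ω·τ radians at frequency ω. The
function names below are illustrative only.

```python
import math

def db_to_gain(db):
    """Convert a gain in decibels to a linear gain factor."""
    return 10.0 ** (db / 20.0)

def delay_phase_loss_deg(tau, omega):
    """Phase lag in degrees of a pure delay exp(-j*omega*tau); omega in rad/sec."""
    return math.degrees(omega * tau)

# Gain margins of -3.7 dB and 6.6 dB expressed as gain factors.
print(db_to_gain(-3.7), db_to_gain(6.6))

# Phase lag of a 0.05 sec delay at 3 rad/sec.
print(delay_phase_loss_deg(0.05, 3.0))
```

The conversions give factors of roughly 0.65 and 2.1, matching the quoted gain
factors, and a phase lag of roughly 8.6 deg for the 0.05 sec delay at
3 rad/sec, in line with the approximately 8 deg quoted in the text and larger
than the 6.6 deg phase margin of the linearized neurocontroller.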

In the neurocontrol design, the weights of the neu- 
ral network (the internal representation of the neu- 
rocontroller) were chosen to minimize the objective 
function (11) over an exhaustive set of pilot input 
commands to the nominal vehicle model by using 
the backpropagation algorithm. No information on 
modelling uncertainties and no constraint on “off- 
nominal” actuator dynamics were provided to the 
neural network during training. Without any con- 
straint other than control effort and rate limits, the 
trained neural network learned to control the nominal 
vehicle model as efficiently as possible (and within the 
resolution of backpropagation). Consequently, the robustness
of the neurocontroller as trained in section 4
is mostly subject to the generalization ability of the
backpropagation algorithm (in the present context, 
generalizing means providing stable control for “off- 
nominal” vehicle model dynamics that were not used 
during training). Because backpropagation is known
in general to have a limited ability to generalize [18], 
the robustness of the neurocontroller as trained in 
section 4 could have been expected to be quite lim- 
ited. 

Within the neural architecture of Fig. 4, one possible
approach to enhance the robustness of the neurocontroller
may be to include all modelling uncertainties
in the training data set. Another possibility
might be to modify the objective function (11) used to
train the neurocontroller to reflect some of the characteristics
of the functional (8) which is minimized in
the H∞ based control design.

7 Conclusions. The applicability of neu- 
ral networks for flight control design was analyzed 
through the process of designing a model-following 
neurocontroller for the example of an integrated air- 
frame/propulsion model of a modern fighter aircraft 
for the piloted longitudinal landing task. For this 
two-input, two-output example,
the control design problem was set up as the task of 
following the trajectories generated from a model of 
the desired vehicle response dynamics to pilot com- 
mand inputs. The neurocontroller was trained by 
simulating the non-linear dynamics of the actuators 
including position and rate limits. The choice of the 
objective function and its minimization over entire 
commanded trajectories were found to be critical to 
the neurocontrol design. A satisfactory trade-off be- 
tween tracking performance and control effort could 
be achieved by an appropriate selection of the weights 
of the objective function. 

The neurocontroller shows better command tracking performance
than a baseline H∞ based controller designed for the
same command tracking problem. However, the neurocontroller
commands larger control rates than the
H∞ based controller, especially for thrust vectoring,
where the neuro-command rides the thrust vectoring
rate limit during initial and final transients. The possibility
of improving the practicality of the proposed
neurocontrol design methodology, to prevent neuro-commands
from riding actuator rate limits without
significant degradation of tracking performance, is
currently being investigated in light of the results
from the minimization of the H∞ based control design.

To gain further insight into the neurocontroller
characteristics, linearized small perturbation representations
of the neurocontroller were obtained at
different time points along a trajectory corresponding
to a demanding set of tracking commands. A linear
analysis of these linearized neurocontroller models
and the H∞ based controller showed some differences
in the controller characteristics. The major difference
between the two controllers is that the H∞ based
controller is a “fixed” dynamic controller whose dynamics
are “automatically” determined through the
synthesis procedure such that the specified criterion is
met in the best possible manner, whereas the neurocontroller
is an input-output mapping which is highly
dependent on the magnitude and direction of the inputs,
and any desired dynamic characteristics have to
be built into the neurocontroller by appropriate selection
of inputs. For instance, both the H∞ based controller
and the neurocontroller have lead characteristics
(rate feedback) from the tracking error measurements
to the control commands; however, the lead
characteristic was a result of the synthesis procedure
for the H∞ based controller, which used only errors as
inputs, whereas for the neurocontroller this lead characteristic
could be obtained only by providing error
rates as explicit inputs (measurements). Developing
neurocontrol design methodologies that can synthesize
the dynamics needed by the neurocontroller to
achieve the desired performance is an area of future
research. A possible approach may lie in the use of
recurrent neural architectures.

Linear stability robustness analysis tools were applied
to the linearized neurocontroller models and to
the baseline H∞ based controller. These analysis
tools showed that the neurocontroller has very
poor stability margins as compared to the H∞ based
controller. The poor phase margins for the neurocontroller
were confirmed in simulation, wherein time
delays of 0.05 sec in both control channels resulted in
a limit cycle pitch response with the neurocontroller,
while there was little performance degradation with
the H∞ based controller. Since the issue of robustness
is critical to practical implementation of flight
control systems, a future area of research is to develop
methodologies for the synthesis of robust neurocontrollers,
and tools to analyze their robustness.

Figure 1. Block diagram for H∞ control design.

Figure 3. Closed-loop response and

Figure 4. Training architecture.

Figure 5. Neural emulation of fuel flow actuator dynamics.

Figure 6. Neural emulation of thrust vectoring actuator dynamics.

Figure 7. Data paths of backpropagation of the error during training.

Figure 8. Evaluation architecture of closed-loop neurocontroller.

Figure 10. Closed-loop response and control

Figure 15. Closed-loop response to Case 2 commands with time delay of 0.05 sec in both controls.